Bayesian nonparametric models for peak identification in MALDI-TOF mass spectroscopy
We present a novel nonparametric Bayesian approach based on Lévy Adaptive
Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix
Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This
model-based approach provides identification and quantification of proteins
through model parameters that are directly interpretable as the number of
proteins, mass and abundance of proteins and peak resolution, while having the
ability to adapt to unknown smoothness as in wavelet-based methods. Informative
prior distributions on resolution are key to distinguishing true peaks from
background noise and resolving broad peaks into individual peaks for multiple
protein species. Posterior distributions are obtained using a reversible jump
Markov chain Monte Carlo algorithm and provide inference about the number of
peaks (proteins), their masses and abundance. We show through simulation
studies that the procedure has desirable true-positive and false-discovery
rates. Finally, we illustrate the method on five example spectra: a blank
spectrum, a spectrum with only the matrix of a low-molecular-weight substance
used to embed target proteins, a spectrum with known proteins, and a single
spectrum and an average of ten spectra from an individual lung cancer patient.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS450 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Nonparametric Models for Peak Identification and Quantification in MALDI-TOF Mass Spectroscopy
We present a novel nonparametric Bayesian model using Lévy random field priors for identifying the presence and abundance of proteins from mass spectrometry data. Informed prior distributions, based on expert opinion and on preliminary laboratory experiments, help distinguish true peaks from background noise and help resolve uncertainty about peak multiplicity.
TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2
We present a novel nonparametric Bayesian approach based on Lévy Adaptive Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This model-based approach provides identification and quantification of proteins through model parameters that are directly interpretable as the number of proteins, mass and abundance of proteins, and peak resolution. Informed prior distributions, based on expert opinion and on preliminary laboratory experiments, help to distinguish true peaks from background noise and help resolve uncertainty about peak multiplicity. Posterior distributions are obtained using a reversible jump Markov chain Monte Carlo algorithm and provide inference about the number of peaks (proteins), their masses and abundance. We show through simulation studies that the procedure has desirable true-positive and false-discovery rates. Finally, we illustrate the method on four example spectra: a blank spectrum, a spectrum with only the matrix of a low-molecular-weight substance used to embed target proteins, and a single spectrum and an average of ten spectra from an individual lung cancer patient.
Nonparametric models for proteomic peak identification and quantification. Bayesian Inference for Gene Expression and Proteomics
We present model-based inference for proteomic peak identification and quantification from mass spectroscopy data, focusing on nonparametric Bayesian models. Using experimental data generated from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectroscopy, we model observed intensities in spectra with a hierarchical nonparametric model for expected intensity as a function of time-of-flight. We express the unknown intensity function as a sum of kernel functions, a natural choice of basis functions for modelling spectral peaks. We discuss how to place prior distributions on the unknown functions using Lévy random fields and describe posterior inference via a reversible jump Markov chain Monte Carlo algorithm.
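As a rough illustration of the sum-of-kernels representation described in this abstract, the Python sketch below evaluates an expected-intensity curve as a baseline plus weighted kernels. The Gaussian kernel shape, the (center, height, width) parameterization, the function names, and the peak values are all assumptions made for illustration; the abstract does not specify them, and neither the Lévy random field prior nor the reversible jump sampler is shown.

```python
import math


def gaussian_kernel(t, center, width):
    """Unnormalized Gaussian bump at `center` (assumed kernel shape)."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)


def expected_intensity(t, peaks, baseline=0.0):
    """Evaluate baseline + sum_j height_j * k(t; center_j, width_j).

    `peaks` is a list of hypothetical (center, height, width) tuples.
    """
    return baseline + sum(
        height * gaussian_kernel(t, center, width)
        for (center, height, width) in peaks
    )


# Hypothetical peaks on an arbitrary time-of-flight axis.
peaks = [(10.0, 2.0, 0.5), (12.0, 1.0, 0.5)]

# Evaluate the curve on a coarse grid from 8.0 to 14.0 in steps of 0.5.
grid = [8.0 + 0.5 * i for i in range(13)]
curve = [expected_intensity(t, peaks) for t in grid]
```

In a model of this kind the number of tuples in `peaks` would itself be unknown, which is why a trans-dimensional sampler such as reversible jump MCMC is used for posterior inference.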
Predicting station locations in bike-sharing systems using a proposed quality-of-service measurement: Methodology and case study
Bike-sharing system (BSS) operators tend to spend a great deal of time and effort satisfying users. Accurately measuring the quality-of-service (QoS) of each station in a BSS advances this mission. Moreover, measuring the QoS and using it to study the spatial dependencies in a BSS allows operators to better manage the system. The traditional QoS measurement reported in the literature is based on the proportion of problematic stations, which are defined as those with no bikes or docks available to users. The authors investigated this traditional measurement and found that it neither exposes the spatial dependencies between stations nor discriminates between stations in a BSS. This study proposes a novel QoS measurement, namely Optimal Occupancy, that captures the impact of heterogeneity in BSSs and reflects the spatial dependencies between stations. Optimal Occupancy is defined as the ratio of the total time a station is functional during a given interval to the length of the interval; this definition also redefines problematic stations. The authors applied geostatistics to explore the spatial configuration of Optimal Occupancy variations and to model variograms for spatial prediction. Results revealed that Optimal Occupancy is beneficial for operators, would result in better prediction of the QoS at nearby locations, and can be used to predict candidate spots for new stations in an existing BSS. For example, the proposed QoS for Station 50 improved after a new nearby station was added, increasing from 0.52 to 0.84 for a Monday and Tuesday of July, respectively.
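The ratio that defines Optimal Occupancy can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the abstract does not say what "functional" means, so the code assumes a station is functional when it has at least one bike and at least one free dock. The function name, the sample format, and the example numbers are hypothetical.

```python
def optimal_occupancy(samples, capacity, start, end):
    """Fraction of [start, end) during which the station is functional.

    samples: list of (timestamp, bikes_available) pairs sorted by time;
    each reading is held until the next one (step function).
    ASSUMPTION: "functional" means 0 < bikes_available < capacity.
    Time before the first sample is counted as non-functional.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    functional_time = 0.0
    for i, (t, bikes) in enumerate(samples):
        # Each reading lasts until the next reading, or until `end`.
        next_t = samples[i + 1][0] if i + 1 < len(samples) else end
        seg_start = max(t, start)
        seg_end = min(next_t, end)
        if seg_end > seg_start and 0 < bikes < capacity:
            functional_time += seg_end - seg_start
    return functional_time / (end - start)


# Hypothetical one-hour window (minutes) for a 10-dock station:
# empty for 15 min, usable for 30 min, full for 5 min, usable for 10 min.
samples = [(0, 0), (15, 3), (45, 10), (50, 6)]
oo = optimal_occupancy(samples, capacity=10, start=0, end=60)
```

Here the station is usable for 40 of the 60 minutes, so `oo` is 40/60. A different definition of "functional" would only change the condition inside the loop.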
Network and station-level bike-sharing system prediction: a San Francisco bay area case study
The paper develops models of bike availability in the San Francisco Bay Area Bike Share System (BSS), applying machine learning at two levels: network and station. Investigating BSSs at the station level is the full problem, one that would provide policymakers, planners, and operators with the level of detail needed to make important choices and draw conclusions. We used Random Forest and Least-Squares Boosting as univariate regression algorithms to model the number of available bikes at the station level. For the multivariate regression, we applied Partial Least-Squares Regression (PLSR) to reduce the number of prediction models needed and to reproduce the spatiotemporal interactions among stations in the system at the network level. Although prediction errors were slightly lower for the univariate models, we found that the multivariate model results were promising for network-level prediction, especially in systems with a relatively large number of spatially correlated stations. Moreover, results of the station-level analysis suggested that demographic information and other environmental variables were significant factors in modeling bikes in BSSs. We also demonstrated that the available bikes modeled at the station level at time (Formula presented.) had a notable influence on the bike count models. Station neighbors and prediction horizon times were found to be significant predictors, with 15 minutes being the most effective prediction horizon time.
Athlétic-tribune sportive: illustrated weekly / [directors Roger Roujean, Pierre Pradeu]
26 October 1933. 1933/10/26 (A15, N708)-1933/10/26. Part of the documentary collection: Aquit
Effects of Bd exposure (white bars = No exposure to Bd; grey bars = Exposure to Bd) and bacterial treatment (normal, antibiotics, and augmented with J. lividum) on OTU richness (a), phylogenetic diversity (b), and relative abundance of the probiotic J. lividum (c) on bullfrog skin one week following exposure to Bd (day 7).
Error bars represent standard error. Asterisks (*) represent significant differences among treatments.
NMDS ordinations based on weighted UniFrac distance matrices (a,b) and Sorensen dissimilarity matrices (c,d) representing differences among microbiota manipulations in microbial community structure and metabolite profiles, respectively, of frogs exposed (a,c) and unexposed (b,d) to Bd one week after initial exposure to Bd.
Microbial community structure and metabolite profiles differed among microbiota manipulations only with Bd exposure.
Conceptual model representing potential responses and interpretations of microbial community structure and function in the presence of a pathogen.